Solving Extremely Large Stochastic Mixed-Integer
Background and Purpose
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
!$!7F32<B353@ 7<3/@$@=5@/;;7<5 ;7<7;7H3A=@;/F7;7H3A/:7<3/@4C<1B7=<
7AAC0831BB=:7<3/@1=<AB@/7<BA 6/A7<B353@/<21=<B7<C=CAD/@7/0:3A
/ General form of
1=;07</B=@7/:=>B7;7H/B7=<>@=0:3;A / !/<G/>>:71/B7=<A
/ !$A=:D/07:7BG6/A033<7;>@=D7<5
min{c
�x : Ax ≤ b, l ≤ x ≤ u, x
j∈ Z, for all j ∈ I}
A ∈ R
m×n, b ∈ R
m, c, l, u ∈ R
n, I ⊆ { 1, . . . , n } min{c
�x : Ax ≤ b, l ≤ x ≤ u, x
j∈ Z, for all j ∈ I}
A ∈ R
m×n, b ∈ R
m, c, l, u ∈ R
n, I ⊆ { 1, . . . , n }
'7'.12/'051(#/#44+7'.:2#3#..'.41.7'3
1/<A=:D37<AB/<13A B6/B 1/<<=B 03 A=:D32 0G AB/B3=4B63/@B!$ A=:D3@A 933>A1/B167<5C>>3@4=@;/<137;>@=D3;3<BA=4AB/B3=4B63/@B!$A=:D3@A
Stochastic programming models optimization problems involving uncertainty An approach: Optimize minimum expected cost
Typically: Decisions are staged
Consider two-stage stochastic mixed-integer programs (SMIPs):
1st stage: deterministic “now” decisions
2nd stage: depends on random event & first stage decisions.
Cost function includes deterministic variables & expected value function of non-deterministic parameters
Stochastic Mixed Integer Programming
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
3F/;>:3
Using sample average approximation (SAA), can be approximated using deterministic equivalent formulations
This assumption yields characteristic dual block-angular structure.
Stochastic MIPs and their deterministic equivalent
A T1W1 T2 W2
..
. . . .
TN WN
minctx
x0 x1 x2 .. . xN
b0 b1 b2 .. . bN
≤
s.t.
Common constraints Independent realization scenarios
}
A T1
T2
TN
... ...
W1
W2
WN
1.+0-+0)%10453#+054 *'4#/'$.1%-4+;'
PIPS-SBB: a distributed-memory parallel solver for deterministic equivalent SMIPs
PIPS-SBB is a specialized Branch-and-Bound based solver for two-stage SMIPs Distributed Memory approach: Distribute data among multiple processors.
PIPS-S: Distributed Memory parallel Simplex solver for Stochastic LPs (Lubin et al., 2013) PIPS-SBB: SMIP-specific cuts, heuristics, branching rules, etc.
Current progress: CP18 Integer and Combinatorial Optimization – Part I (Wed. 17:30) PIPS-SBB: A distributed-Memory Linear-Programming-Based Branch-and-Bound Solver for Stochastic Mixed-Integer Programs given by Geoffrey Malcolm Oxberry
...A T1 T2
TN
... ...
...
W1 W2
WN
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Branch and Bound for solving MIPs
Branch and Bound (B&B): Most widely used algorithms for solving MIPs to optimality Upper Bounds (UB) are provided by the integer solutions found along the B&B
exploration
Each subproblem has a Lower Bound (LB) associated with the LP relaxation Upper bound (UB)
Lower bound (LB) UB−LB
UB ·100 GAP(%)
| |
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Branch and Bound is straightforward to parallelize in theory
Processing of subproblems is independent Shared-memory parallelization present in most
state-of-the-art MIP solvers
Distributed-memory parallelization is not as straightforward Need to distribute work among different nodes
Information sharing results in communication overheads Classification: Smallest transferrable unit of work
Coarse-grained parallelization: Unit of work is MIP subtree Fine-grained parallelization: Unit of work is MIP tree node
Branch and Bound is straightforward to parallelize in theory
Processing of subproblems is independent
Processing of nodes is a sequential computational bottleneck
PIPS-SBB: B&B based Stochastic MIP solver Processing of nodes in parallel (parallel LP relaxation,
parallel heuristics, parallel problem branching, …).
...
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
'44#)'#44+0)05'3(#%'
Branch and Bound is straightforward to parallelize in theory
Processing of subproblems is independent
Processing of nodes is a sequential computational bottleneck
PIPS-SBB: B&B based Stochastic MIP solver Processing of nodes in parallel (parallel LP relaxation,
parallel heuristics, parallel problem branching, …).
Parallelized B&BPIPS-SBB: Two levels of parallelism Branch and Bound in parallel (coarse/fine)
Coarse-grained: ug[PIPS-SBB, MPI]
Fine-grained: PIPS-PSBB
...
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
'44#)'#44+0)05'3(#%'
Parallel Branch and Bound as a graph problem
We can regard parallel Branch and Bound as a parallel graph exploration problem Given p processors, frontierof a tree is the set of subproblems currently being open Subset currently processed in parallel are the activenodes
Redundantnode: A subproblem fathomable given that the optimal solution is known.
Goal: Increase efficiency of Parallel B&B by reducing number of useless nodes explored.
Parallel Branch and Bound
To reduce the amount of useless nodes explored, the search must Fathom subproblems using high quality UBs
How to focus on the most promising nodes To increase the parallel efficiency by:
Generating a set of active nodes comprised of the most promising nodes Employing processors to explore the smallest amount of active nodes With less communication overhead
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Coarse-grained Parallel Branch and Bound: MIP subtrees
Advantages:
Less communication between MPI processes Centralized implementations simpler to implement Challenges:
Extra work likely to be performed when compared to the sequential case May scale poorly due to load imbalance
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Course-grained approach to Parallel Branch and Bound using UG
Ubiquity Generator Framework(UG) is a generic framework to parallelize B&B solvers Exploits powerful performance of state-of-the-art base solvers, such as SCIP, Xpress, Gurobi, CPLEX External parallelization using Sub-MIPs (Unit of work: MIP subtree)
Solving techniques involved in SCIP
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
/ MILP (Mixed Integer Linear Programming) Problem
/ Parallel solving with mathematically supercharged algorithms
/ Finite, but exponential search space
B&B: implicit enumeration for global optimality / UG: parallelization framework for state-of-the-art
B&B algorithms (e.g. for MINLP, etc.) / Parallelize SCIP: Solving Constraint Integer Programs / Parallelize commercial solvers
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
min{c�x:Ax≤b, l≤x≤u, xj∈Z,for allj∈I}
A∈Rm×n, b∈Rm, c, l, u∈Rn, I⊆ {1, . . . , n}
A#@757</:
AC0>@=0:3; A’$@3A=:D32 AC0>@=0:3;
min{c���x:Ax≤b, l≤x≤u, xj∈Z,for allj∈I}
A∈Rm×n, b∈Rm, c, l, u∈Rn, I⊆ {{1, . . . , n}}
exponential sea 74471C:B G</;71:=/2
0/:/<17<5
UG Ubiquity Generator Framework - 1
UG Ubiquity Generator Framework - 2
!(3#/'813-)A7<5$B= 1=<B@=:
A=:D7<5/:5=@7B6;A )A7<5$B=1=<B@=:
A=:D7<5/:5=@7B6;A )A7<5$B=1=<B@=:
A=:D7<5/:5=@7B6;A
/A3A=:D3@ /A3A=:D3@ /A3A=:D3@
)A7<5=@25*3'#&4
4=@1=;;C<71/B7=<A )A7<5=@25*3'#&4
4=@1=;;C<71/B7=<A )A7<5=@25*3'#&4 4=@1=;;C<71/B7=<A A6/@32;3;=@G
C5,25*3'#&4-703@'$
C5,"23'4425*3'#&4-703@*>@3AA
=/2A/@31==@27</B320G/A>317/:>@=13AA=@B6@3/2 /A3A=:D3@
#>@3A=:D3
27AB@70CB32;3;=@G C5,-$/@/'$
C5,"23'44-$/@/*>@3AA
#3#..'.
1.7'3 045#05+#5+10 =/2=@@27</B=@
)A7<5$B=1=<B@=:
A=:D7<5/:5=@7B6;A
/A3A=:D3@
)A7<5=@25*3'#&4 4=@ 1=;;C<71/B7=<A
!1.7'3
95'30#.
2#3#..'.+;#5+10
UG Ubiquity Generator Framework - 3
Main features provided by UG Ramp-up mechanism
Normal ramp-up, racing ramp-up Dynamic load balancing
Checkpointing and restarting mechanism Deterministic mode for debugging
Some facts about ug[SCIP, MPI]: ParaSCIP
Solved 2 MIPLIB2003 previously unsolved instances for the first time Solved 12 MIPLIB2010 previously unsolved instances for the first time ug[SCIP-Jack, MPI] solve three previously unsolved instances
from PUC Steiner Tree Problem test set
SCIP-Jack: A customized SCIP solver for Steiner Tree Problem
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Main results for MIP solving by ParaSCIP
"=D3;03@B@7>B7;E/AA=:D32E7B61=@3A7<6=C@A 313;03@@;7<3E/AA=:D32E7B6@3AB/@B328=0A
(7B/<@/G*#>B3@=< H@/G3;7<77<B3@1=<<31B '!C87BAC$&!&+&*'
&"':B7F **3=< %H *H &"@/G*<B3:*3=<DH@73A7<B3@1=<<31B
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
0303@ 3333333 /A A= 32 B 3AB/ B32 8=0A
B/< @/G*#>B3@=< H @/G3;7<77<B3@1=<<31B '! C87BAC $&!&+ &*'
AA=:D32E7B61=@3A7<6=C@A 3
3 3 3
3E/AA=:D32E7B6@3AB/@B328=0A
The biggest and the longest computation
Solving rmine10: 48 restarted runs with 6,144 to 80,000 cores
1 100 10000 1x106 1x108 1x1010
0 10000 20000 30000 40000 50000 60000 70000 80000 90000
Number of Nodes Number of Active Solvers + 1
# nodes left
# active solvers
# nodes in check-point file -1916
-1915.5 -1915 -1914.5 -1914 -1913.5 -1913 -1912.5 -1912
0 1x106 2x106 3x106 4x106 5x106 6x106 7x106
Objective Function Value
Computing Time (sec.) Incumbents
Global LBs
=EC>>3@/<2:=E3@0=C<2A3D=:D32
80000 90000
+ 1# nodes left
# active solvers
# d i h k i t fil x106 2x106 3x106 4x106 5x106 6x106 7x106
Computing Time (sec.) Incumbents
Global LBs
EC>>3@/<2:=E3@0=C<2A3D=:D32
(7B/<E7B61=@3A (63=B63@A &"
5511-#$165&#:4
#0&:'#341(
!%13'*1634
*#0&.'62!%#0
51
231%'44
Performance of state-of-the-art MIP Solvers
!$A=:D3@03<16;/@9B6@3/2'674B3253=;3B@71;3/<=4@3AC:BAB/93<4@=;
B636=;3>/53=4 /<A!7BB3:;/<< >@ )<A=:D32=@4/7:327<AB/<13A /@3/11=C<B324=@E7B6B63B7;3:7;7B=46=C@A
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
>3@4=@;/<13C53 27443@3<13
'$'=:D7<5=<AB@/7<B<B353@$@=5@/;A
=<3=4B63;=AB>=E3@4C:<=<1=;;3@17/:!$A=:D3@
extendable to solve more large classes of
=>B7;7H/B7=<>@=0:3;A
B6@3/2
Solving open instances of MIPLIB2010
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
+3/@
"C;03@=4=>3<7<AB/<13AA=:D32 6BB> ;7>:70H7023
=>3<7<AB/<13A:34B
!$ E/A
>C0:7A632
&" &"
1=@3A1=@3A
"3E$/@/*>@3AA
Course-grained approach to Parallel Branch and Bound using UG
Ubiquity Generator Framework(UG) is a generic framework to parallelize B&B solvers Exploits powerful performance of state-of-the-art base solvers, such as SCIP, Xpress, Gurobi, CPLEX External parallelization using Sub-MIPs (Unit of work: MIP subtree)
Base solver and communication library used are encapsulatedby UG:
ug[PIPS-SBB,MPI]
Course-grained approach to Parallel Branch and Bound using UG
Dynamic load balancing using Lower Bounds (quality of “goodness”):
LoadCoordinator (LC) maintains queue of “good” nodes LC Requests nodes from “good” solvers when queue size is small LC hands out node from queue to solver when requested Solver requests “good” node when current nodes are not “good”
Other details:
Multiple Ramp-up mechanisms: Normal ramp-up, racing ramp-up Checkpointing and restarting mechanism
Deterministic mode for debugging
Capable of handling distributed-memory based base solver
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Course-grained approach to Parallel Branch and Bound using UG
mpirun –np 10 ./ugpipssbbSMPS … -npipsbk 3 E/7B7<5
@C<<7<5 .1$#.+0%6/$'05
41.65+10
1#&113&+0#513#0-!1.7'3
#0- !1.7'3
#0- !1.7'3
#0-$$''&/<9
$$''&/<9
$$''&/<9
$$''&/<9
$$''&/<9
$$''&/<9
$$''&/<9
$$''&/<9
$$''&/<9
'�-!41.7'33#0-$$'' &/<9
$$''&/<9
$$''&/<9
'C0!$1=;>CB/B7=<7A>3@4=@;32E7B6;C:B7>:3!$>@=13AA3A
"332B=AG<16@=<7H3E63</)'=:D3@1=;;C<71/B3E7B6 k 3
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Fine-grained Parallel Branch and Bound: MIP tree nodes
The smallest transferrable unit of work is a Branch and Bound node.
Queues in processors are collections of subtrees (frontier nodes) Advantage: Allows for great flexibility and fine-grained control
Disadvantage: More communication
Solution: Coordination and exchange is decentralized to minimize communication
Fine-grained control: Share the most promising nodes Solution: All-to-all parallel node exchange of best active nodes
Load balancing is maintained via synchronous MPI collective communications
Lower boundsof most promising K x N nodes of all processors are exchanged and ranked
The top K x N are selected and nodesare redistributed in a round robin fashion Because of fine-grained nature of the
approach, communication must be used strategically to minimize overheads Status of each solver (Upper/lower bounds, tree
sizes, times, solutions, …) exchanged asynchronously
Node transfers are synchronous Exchanges are triggered judiciously
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
0 1 2 3 5 6 8 15 16 4 7 9 10
12
11 13 14 17 18 19 Solver 0
0 1 2 3 5 6 8 15 16 4 7 9 10
12
11 13 14 17 18 19 Solver 2
0 1 2 3 5 6 8 15 16 4 7 9 10
12
11 13 14 17 18 19 Solver 0
0 1 2 3 5 6 8 15 16 4 7 9 10
12
11 13 14 17 18 19 Solver 1
0 1 2 3 4 5 6 7 8
Solver 0
0 1 2 3
Solver 1 Solver 2
5 6 8
4 7 9 10 11 12
14 13
Gather top K · N · N bounds (K nodes · N solvers · N solvers)
Solver 0
0 3 6
Solver 1 Solver 2
2 5 8
11 10 12 Redistribution of
top K · N nodes
1 4 7 9
13 14
16 20 21
15 22
18
17 19
Sort, and select top K · N bounds
K=3, N=3
16 20 21
15 22
18
17 19
Solver 2
0 1 2 3 4 5 6 7 8 Solver 1
0 1 2 3 4 5 6 7 8 Solver 0
0 1 2 3 5 6 8 15 16 4 7 9 10
12
11 13 14 17 18 19 Solver 0
0 1 2 3 5 6 8 15 16 4 7 9 10
12
11 13 14 17 18 19 Solver 0
n Node estimation/bound n Node information
Fine-grained approach to Parallel Branch and Bound
Two levels of parallelism require a layered organization of MPI processors In the PIPS-S communicator,
processors perform in parallel:
Presolve, LP relaxations
Primal Heuristics, Cuts, Branching and Node Selection
In the Branch and Bound communicator, processors exchange:
B&B Nodes, Lower Bound information Solutions, Queue sizes, and Search status Two Strategies for Ramp-up:
Parallel Strong Branching Standard Branch and Bound
Strategy for Ramp-down: intensify the frequency of node rebalancing
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
We test our solver on SSLP instances, from the SIPLIB library SSLP instances model server locations under uncertainty
Instances are coded as SSLPm_n_s, where s represents the number of scenarios Larger number of scenarios means bigger problems
LP relaxations of all instances fit in memory, even in CPLEX PIPS-SBB can handle much larger LP relaxations
Details: See http://www2.isye.gatech.edu/~sahmed/siplib/sslp/sslp.html Larger number of scenarios means bigger problems
LP relaxations of all instances fit in memory, even in CPLEX PIPS-SBB can handle much larger LP relaxations
Computationally intensive features MIP Heuristics: on, Strong Branching: off All runs on the Cab cluster at LLNL:
Each node: Intel Xeon E5-2670, 2.6 GHz, 2 CPUs x 8 cores/CPU 16 cores/node, 2 GB RAM/core, 32 GB RAM/node Infiniband QDR interconnect
Experimental performance results: Setup
We measure parallel performance in terms of wall clock time, speedup, communication overhead, number of nodes processed and primal dual integral
Wall clock time
Speedup: Fraction of time Tpneeded to reach optimality by a configuration with p processors with respect to the time needed by a sequential baseline T1:
Communication overhead (Parallel PIPS-SBB): Fraction of time Tcomm+ Tsyncneeded for communication and processor synchronization with respect to the total time of execution Texec:
Idle time ratio (ug[PIPS-SBB,MPI]): Fraction of average idle time with respect to execution Texec:
Node efficiency:Fraction of redundant nodes explored Nr with respect to the total number of nodes explored Ntotal.
Experimental performance results:
Measuring parallel performance
$
$ "
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
2:3B7;3@/B7=$#
!
$
$+(%(#$%#&(!)%$
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Performance comparison between PIPS-PSBB and ug[PIPS-SBB,MPI] when optimizing small instances.
sslp_15_45_5 (5 scenarios, 3390 binary variables, 301 constraints)
$$'$'
o '1/:3AC>B=1=@3AF o (=B/:E=@9>3@4=@;32@3;/7<A
E7B67</4/1B=@=4FE@B A3?C3<B7/:
o =;;C<71/B7=<=D3@63/2 2=;7</B3A/4B3@1=@3A o "=237<3447173<1G5@=EA/B/
A:=E3@@/B3B6/<C5,$$' '!$- C5,$$''!$-o A/4/1B=@=4FA:=E3@
o (=B/:E=@9D/@73A0G>@=13AA=@
1=<475C@/B7=<
o 7563@1=;;C<71/B7=<=D3@63/2 /<267563@<=237<3447173<1G 0
10 20 30 40 50 60 70
Communicationoverhead(%)
1 2 10 50 100200400
Number of Processors 0
10 20 30 40 50 60 70
Scaling:Timetooptimality
1 2 10 50 100 200400
Number of Processors
100000 200000 300000 400000 500000 600000 700000
Totaltreesize
1 2 10 50 100 200400
Number of Processors 0 10 20 30 40 50 60 70
Nodeinefficiency(%)
1 2 10 50 100200400
Number of Processors
PIPS-PSBB ug[PIPS-SBB,MPI]
+$!$* %##+$!*!%$('+$.%
/ PIPS-PSBB allows to modify the frequency between synchronous communications.
/ Frequency defined with (x,y), where xand yrepresent the minimum and maximum number of B&B iterations that must be processed before communication takes place.
/ Tighter communication increases communication overheads, but reduces work performed.
0 10 20 30 40 50 60 70 80 90
Communicationoverhead(%)
1 2 10 50 100 200 400 Number of Processors 0
10 20 30 40 50 60 70
Scaling:Timetooptimality
1 2 10 50100 200 400 Number of Processors
150000 200000 250000 300000 350000 400000 450000 500000
Totaltreesize
1 2 10 50100 200 400 Number of Processors Tight Communication (10-500) Standard Communication (50-1000) Loose Communication (100-50000)
%",(&(%(#$-&%)))"&
)$(!%)!$(.,(!") %$)*(!$*)
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
0 100 200 300 400 500 600
RootnodeLPRelaxationTime(s)
1[500] 5[100] 10[50] 20[25] 25[20]50[10]100[5]500[1]
Number of solvers[Processors per PIPS-S solver]
0 2e+06 4e+06 6e+06 8e+06 1e+07 1.2e+07
Totaltreesize
1[500 ] 5[100
]
10[50] 20[25] 25[20] 50[10] 100[5] 500[1]
Number of solvers[Processors per PIPS-S solver]
0 5 10 15 20 25
Communicationoverhead(%)
1[500] 5[100] 10[50]
20[25]
25[20]
50[10]
100[5]
500[1]
Number of solvers[Processors per PIPS-S solver]
0 2 4 6 8 10 12 14 16 18
Rampupcommunicationoverhead(%)
1[500] 5[100] 10[50] 20[25] 25[20] 50[10] 100[5] 500[1]
Number of solvers[Processors per PIPS-S solver]
$$'/ '>332C>B=1=@3A7AF / $3@4=@;/<137<1@3/A3AC>B=
1=@3A
$$'$'
/ =;;C<71/B7=<=D3@63/2
;7<7;/:3F13>B/B@/;>C>
E63< $A=:D3@7AA:=E
%#&(!)%$!$)*
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
Instance Scenarios Configuration PIPS-PSBB ug[PIPS-SBB,UG] CPLEX SM CPLEX DM
Solvers PIPS-S procs
GAP(%) (Time)(s) GAP(%) (Time)(s) Procs GAP(%) (Time)(s) Procs GAP(%) (Time)(s)
sslp_5_25_50 50 2 2 (7.45s) (8.03s) 4 (0.27s) 4 (0.27s)
sslp_5_25_100 100 2 2 (22.37s) (17.79s) 4 (0.64s) 4 (0.64s)
sslp_15_45_5 5 200 2 (107.11s) (163.53s) 16 (1.97s) 400 (6.26s)
sslp_15_45_10 10 200 2 0.09% 0.16% 16 (1.81s) 400 (15.04s)
sslp_15_45_15 15 200 2 0.25% 0.30% 16 (7.80s) 400 (15.75s)
sslp_10_50_50 50 200 10 0.13% 0.21% 16 (43.88s) 2000 0.15%(M)
sslp_10_50_100 100 200 10 0.17% 0.20% 16 (221.69s) 2000 0.16%(M)
sslp_10_50_500 500 200 10 0.24% 0.24% 16 4.91%(M) 2000 1.25%(M)
sslp_10_50_1000 1000 200 10 0.24% 0.24% 16 9.91% 2000 6.08%
sslp_10_50_2000 2000 200 10 0.26% 0.26% 16 19.93% 2000 8.11%
Time limit: 1 hour
/ Distributed-memory parallelization of CPLEX is often inferior to its shared-memory counterpart.
/ Both CPLEX versions run into Memory limits for some problems.
/ The superior performance of CPLEX’s base solver helps in trivial and small problems.
/ PIPS-SBB-based solvers show superior performance for large problems.
Performance comparison against CPLEX 12.6.2
UG Synthesizer
)''=:D3@A<!$1=;;C<71/B=@4=@)'7A>@=D7232 C5A
!327/B3 A=:CB7=<
A6/@7<5
$.*>@3AAC5A 1=<475 A=:D3@A
3C@7AB71A=:D3@A can be distributed
/'/13:41.7'34$1.&'&)'
<
$/@/*>@3AAC5A 1=<475$/@/'$C5A
1=<475
<
1=<475*>@3AAC5A 1=<475C@=07C5A 1=<475$ *C5A<
>/@/::3:>@=13AA3A
>/@/::3:>@=13AA3A
1=<475C@/B7=<47:3 A1@7>BB=
53<3@/B3@C<B7;3 3<D7@=<;3<B ;/93AG;0=:71:7<9A 53<3@/B3/<;>7@C< 1=;;/<2
4=@/<!$!!$>@=5@/;
&C<B7;3>@=13AA3A
=</AC>3@1=;>CB3@
)''=:D3@A<!$1=;;C<71/B=@4=@)'7A>@=D7232 C5A
!327/B3 A=:CB7=<
A6/@7<5
>
$.*>@3AAAC5A 1=<475 1=<475 A=:D3@A
3C@7AB71A=:D3@A can be distributed
/'/13:41.7'34$1.&'&)'
<
<
$/@/*>@3AA$/@/*>@3AAC5A 1=<475 1=<475$/@/'$C5A 1=<475
1=<475
< < <
1=<475*>@3AAC5A55b di t b A
C@=0C5A 5 1=<4755
>>
$$
t ib t dd t ib 07
$ *C5A 5
1=<4755
< <
> >
PAC: L. M unguia, S. Ahmed, D. Bader, G.L. Nemhauser and Y. Shao. ‘‘Alternating Criteria
Conclusions
2D/<13;3<B=4;/B63;/B71/:;=23:=427A/AB3@>@3D3<B7=</<23D/1C/B7=<>:/<<7<5B=E/@2A=17/:7;>:3;3<B/B7=<
We developed a light-weight decentralized distributed memory branch-and-bound implementation for PIPS-SBB with two degrees of parallelism:
Processing of nodes in parallel (parallel LP relaxation, parallel heuristics, parallel problem branching, …).
Branch and Bound in parallel.
Better parallel efficiency is achieved by focusing the parallel resources in the most promising nodes.
We try to reduce communication bottlenecks and achieve high processor occupancy via a decentralized control of the tree exploration and a lightweight mechanism for exchanging Branch and Bound nodes.
/ Superior performance to state-of-the-art commercial MIP solvers, in the context of large instances.
The SCIP Optimization Suite
Matthias Miltenberger
Zuse Institute Berlin
26thFebruary, 2015
Matthias Miltenberger (ZIB): The SCIP Optimization Suite 1
The SCIP Optimization Suite
◮ toolbox forgeneratingandsolvingconstraint integer programs
◮ free for academic use, available in source code
ZIMPL
◮ model and generate LPs, MIPs, and MINLPs